Datascience para el Bien Social

Exploring the trends people follow when celebrating Thanksgiving.

In this project we explore the trends people follow at Thanksgiving.

  • First, we look at which main dishes and desserts are people's favorites.

  • Second, we extract household income and see how it relates to how far people travel to celebrate Thanksgiving.

  • Finally, we wrap up the project by bringing together celebration and friends, and looking at which age range tends to celebrate Thanksgiving the most.

In [1]:
import pandas as pd

dataset = pd.read_csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/thanksgiving-2015/thanksgiving-2015-poll-data.csv', encoding='latin1')
print(dataset.head(3))
   RespondentID Do you celebrate Thanksgiving?  \
0    4337954960                            Yes   
1    4337951949                            Yes   
2    4337935621                            Yes   

  What is typically the main dish at your Thanksgiving dinner?  \
0                                             Turkey             
1                                             Turkey             
2                                             Turkey             

  What is typically the main dish at your Thanksgiving dinner? - Other (please specify)  \
0                                                NaN                                      
1                                                NaN                                      
2                                                NaN                                      

  How is the main dish typically cooked?  \
0                                  Baked   
1                                  Baked   
2                                Roasted   

  How is the main dish typically cooked? - Other (please specify)  \
0                                                NaN                
1                                                NaN                
2                                                NaN                

  What kind of stuffing/dressing do you typically have?  \
0                                        Bread-based      
1                                        Bread-based      
2                                         Rice-based      

  What kind of stuffing/dressing do you typically have? - Other (please specify)  \
0                                                NaN                               
1                                                NaN                               
2                                                NaN                               

  What type of cranberry saucedo you typically have?  \
0                                               None   
1                             Other (please specify)   
2                                           Homemade   

  What type of cranberry saucedo you typically have? - Other (please specify)  \
0                                                NaN                            
1                    Homemade cranberry gelatin ring                            
2                                                NaN                            

          ...          \
0         ...           
1         ...           
2         ...           

  Have you ever tried to meet up with hometown friends on Thanksgiving night?  \
0                                                Yes                            
1                                                 No                            
2                                                Yes                            

  Have you ever attended a "Friendsgiving?"  \
0                                        No   
1                                        No   
2                                       Yes   

  Will you shop any Black Friday sales on Thanksgiving Day?  \
0                                                 No          
1                                                Yes          
2                                                Yes          

  Do you work in retail? Will you employer make you work on Black Friday?  \
0                     No                                              NaN   
1                     No                                              NaN   
2                     No                                              NaN   

  How would you describe where you live?      Age What is your gender?  \
0                               Suburban  18 - 29                 Male   
1                                  Rural  18 - 29               Female   
2                               Suburban  18 - 29                 Male   

  How much total combined money did all members of your HOUSEHOLD earn last year?  \
0                                 $75,000 to $99,999                                
1                                 $50,000 to $74,999                                
2                                       $0 to $9,999                                

            US Region  
0     Middle Atlantic  
1  East South Central  
2            Mountain  

[3 rows x 65 columns]
In [3]:
header = dataset.columns.unique()
In [4]:
cel_thanks_count = dataset["Do you celebrate Thanksgiving?"].value_counts()
cel_thanks = dataset[dataset["Do you celebrate Thanksgiving?"] == 'Yes']

print(cel_thanks.head(3))
   RespondentID Do you celebrate Thanksgiving?  \
0    4337954960                            Yes   
1    4337951949                            Yes   
2    4337935621                            Yes   

  What is typically the main dish at your Thanksgiving dinner?  \
0                                             Turkey             
1                                             Turkey             
2                                             Turkey             

  What is typically the main dish at your Thanksgiving dinner? - Other (please specify)  \
0                                                NaN                                      
1                                                NaN                                      
2                                                NaN                                      

  How is the main dish typically cooked?  \
0                                  Baked   
1                                  Baked   
2                                Roasted   

  How is the main dish typically cooked? - Other (please specify)  \
0                                                NaN                
1                                                NaN                
2                                                NaN                

  What kind of stuffing/dressing do you typically have?  \
0                                        Bread-based      
1                                        Bread-based      
2                                         Rice-based      

  What kind of stuffing/dressing do you typically have? - Other (please specify)  \
0                                                NaN                               
1                                                NaN                               
2                                                NaN                               

  What type of cranberry saucedo you typically have?  \
0                                               None   
1                             Other (please specify)   
2                                           Homemade   

  What type of cranberry saucedo you typically have? - Other (please specify)  \
0                                                NaN                            
1                    Homemade cranberry gelatin ring                            
2                                                NaN                            

          ...          \
0         ...           
1         ...           
2         ...           

  Have you ever tried to meet up with hometown friends on Thanksgiving night?  \
0                                                Yes                            
1                                                 No                            
2                                                Yes                            

  Have you ever attended a "Friendsgiving?"  \
0                                        No   
1                                        No   
2                                       Yes   

  Will you shop any Black Friday sales on Thanksgiving Day?  \
0                                                 No          
1                                                Yes          
2                                                Yes          

  Do you work in retail? Will you employer make you work on Black Friday?  \
0                     No                                              NaN   
1                     No                                              NaN   
2                     No                                              NaN   

  How would you describe where you live?      Age What is your gender?  \
0                               Suburban  18 - 29                 Male   
1                                  Rural  18 - 29               Female   
2                               Suburban  18 - 29                 Male   

  How much total combined money did all members of your HOUSEHOLD earn last year?  \
0                                 $75,000 to $99,999                                
1                                 $50,000 to $74,999                                
2                                       $0 to $9,999                                

            US Region  
0     Middle Atlantic  
1  East South Central  
2            Mountain  

[3 rows x 65 columns]

2) Exploring the main dishes people typically eat at Thanksgiving dinner.

In [5]:
thanks_dishes_count = dataset['What is typically the main dish at your Thanksgiving dinner?'].value_counts()
print('         Main Dish \n', thanks_dishes_count)

tofurkey_dish = dataset[dataset['What is typically the main dish at your Thanksgiving dinner?'] == 'Tofurkey']
tofurkey_have_gravy = tofurkey_dish['Do you typically have gravy?']
print(tofurkey_have_gravy)
         Main Dish 
 Turkey                    859
Other (please specify)     35
Ham/Pork                   29
Tofurkey                   20
Chicken                    12
Roast beef                 11
I don't know                5
Turducken                   3
Name: What is typically the main dish at your Thanksgiving dinner?, dtype: int64
4      Yes
33     Yes
69      No
72      No
77     Yes
145    Yes
175    Yes
218     No
243    Yes
275     No
393    Yes
399    Yes
571    Yes
594    Yes
628     No
774     No
820     No
837    Yes
860     No
953    Yes
Name: Do you typically have gravy?, dtype: object

Results:

  • 20 respondents typically have Tofurkey as the main dish.
  • Of these, 12 typically have gravy with the Tofurkey and 8 do not.
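The yes/no tally for one main dish can be obtained directly with `value_counts` instead of counting the printed rows by hand. The following is a minimal sketch on a made-up toy table; the short column names `main_dish` and `gravy` are hypothetical stand-ins for the long survey questions.

```python
import pandas as pd

# Toy data (hypothetical), standing in for the two survey columns.
toy = pd.DataFrame({
    'main_dish': ['Turkey', 'Tofurkey', 'Tofurkey', 'Ham/Pork', 'Tofurkey'],
    'gravy': ['Yes', 'Yes', 'No', 'Yes', 'Yes'],
})

# Keep only the Tofurkey rows, then count each gravy answer.
gravy_counts = toy.loc[toy['main_dish'] == 'Tofurkey', 'gravy'].value_counts()
print(gravy_counts)
```

Applied to the real dataset, the same pattern would use the full question text as the column names.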

3) Now let's see which pies people eat during this celebration.

In [6]:
apple_isnull = pd.isnull(dataset['Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Apple'])
pumpkin_isnull = pd.isnull(dataset['Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pumpkin'])
pecan_isnull = pd.isnull(dataset['Which type of pie is typically served at your Thanksgiving dinner? Please select all that apply. - Pecan'])

# True when none of the three pies was selected (all three columns are null).
no_pies = apple_isnull & pumpkin_isnull & pecan_isnull
print(no_pies.value_counts())
False    876
True     182
dtype: int64

4) Converting age to numbers.

In [17]:
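def ext_age(age):
    # Missing answers stay missing.
    if pd.isnull(age):
        return None
    
    # '18 - 29' -> '18' and '60+' -> '60': keep the lower bound of the range.
    age_split = age.split(' ')
    lower_bound = age_split[0].replace('+', '')
    return int(lower_bound)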
        
example = ext_age(dataset['Age'][14])  

dataset['int_age'] = dataset['Age'].apply(ext_age)

dataset['int_age'].describe()
Out[17]:
count    1025.000000
mean       39.383415
std        15.398493
min        18.000000
25%        30.000000
50%        45.000000
75%        60.000000
max        60.000000
Name: int_age, dtype: float64

Results:

  • As we can see, the describe method gives us summary statistics for the ages we are working with.

  • We have to keep in mind that these ages are not exact: the original data gives age as a range, and we always take the minimum of that range. In other words, the values are approximations, not true ages.
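The lower-bound conversion can also be written without a row-by-row function. This is a minimal sketch, assuming the age categories look like the ones shown in the notebook output (`18 - 29`, `60+`); the sample series and the name `lower_bound` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample values mirroring the age categories in the survey, plus a missing answer.
ages = pd.Series(['18 - 29', '30 - 44', '45 - 59', '60+', np.nan])

# Extract the first run of digits (the lower bound of each range).
# Missing answers stay NaN, so the result is a float column.
lower_bound = ages.str.extract(r'(\d+)', expand=False).astype(float)
print(lower_bound.tolist())
```

Because every respondent is assigned the bottom of their bracket, statistics computed on such a column understate the true ages.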

In [7]:
dataset['Age'].value_counts()
Out[7]:
45 - 59    286
60+        264
30 - 44    259
18 - 29    216
Name: Age, dtype: int64

5) Converting income to numeric.

In [19]:
dataset['How much total combined money did all members of your HOUSEHOLD earn last year?'].value_counts()

# Convert an income range string to its numeric lower bound.
def ext_income(income):
    if pd.isnull(income):
        return None
    
    income_split = income.split(' ')
    if income_split[0] == 'Prefer':
        return None
    income_splitted = income_split[0].replace("$", "").replace(",", "")
    
    return int(income_splitted)

example1 = dataset['How much total combined money did all members of your HOUSEHOLD earn last year?'][0].split(' ')

dataset['int_income']  = dataset['How much total combined money did all members of your HOUSEHOLD earn last year?'].apply(ext_income)
dataset['int_income'].describe()
Out[19]:
count       889.000000
mean      74077.615298
std       59360.742902
min           0.000000
25%       25000.000000
50%       50000.000000
75%      100000.000000
max      200000.000000
Name: int_income, dtype: float64

Results:

  • We have to keep in mind that income, as with age, is given as a range, so we are taking the minimum value of each range.

  • The mean income is fairly high (about 74,078), and the standard deviation is also large (about 59,361).
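To see what the income conversion does with each kind of answer, it helps to run it on a few strings. The sketch below restates the same logic as `ext_income` under a hypothetical name, `income_lower_bound`; the sample strings are assumed examples of the survey's answer format, not values copied from the dataset.

```python
import numpy as np
import pandas as pd

def income_lower_bound(income):
    """Return the lower bound of an income range string, or None if unavailable."""
    # Missing answers and 'Prefer ...' answers carry no number.
    if pd.isnull(income) or income.startswith('Prefer'):
        return None
    # '$25,000 to $49,999' -> '$25,000' -> 25000
    first_token = income.split(' ')[0]
    return int(first_token.replace('$', '').replace(',', ''))

# Assumed sample answers covering each branch of the function.
samples = ['$25,000 to $49,999', '$0 to $9,999', 'Prefer not to answer', np.nan]
converted = [income_lower_bound(s) for s in samples]
print(converted)
```

Only the first dollar amount is kept, which is why the summary statistics describe bracket minimums rather than actual incomes.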

6) Relating distance traveled and income.

In [20]:
dataset['How far will you travel for Thanksgiving?'].value_counts()
Out[20]:
Thanksgiving is happening at my home--I won't travel at all                         396
Thanksgiving is local--it will take place in the town I live in                     276
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    197
Thanksgiving is out of town and far away--I have to drive several hours or fly       82
Name: How far will you travel for Thanksgiving?, dtype: int64
In [21]:
dataset['How far will you travel for Thanksgiving?'][dataset['int_income'] < 150000].value_counts()
Out[21]:
Thanksgiving is happening at my home--I won't travel at all                         281
Thanksgiving is local--it will take place in the town I live in                     203
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    150
Thanksgiving is out of town and far away--I have to drive several hours or fly       55
Name: How far will you travel for Thanksgiving?, dtype: int64
In [22]:
dataset['How far will you travel for Thanksgiving?'][dataset['int_income'] > 150000].value_counts()
Out[22]:
Thanksgiving is happening at my home--I won't travel at all                         49
Thanksgiving is local--it will take place in the town I live in                     25
Thanksgiving is out of town but not too far--it's a drive of a few hours or less    16
Thanksgiving is out of town and far away--I have to drive several hours or fly      12
Name: How far will you travel for Thanksgiving?, dtype: int64

Results:

  • People with income below 150000 most often celebrate Thanksgiving at home (281 of 689 answers, about 41%) and least often travel far out of town (55, about 8%).

  • People with high income (>150000) also mostly celebrate at home (49 of 102 answers, about 48%).

  • Interestingly, people with income below 150000 more often travel out of town for a short trip (about 22% versus 16%), while people with high income more often travel far (about 12% versus 8%). One possible explanation is that more high-income respondents live far from their families and have to travel to visit them. Two caveats: the high-income group is small (102 answers), and both filters use strict comparisons, so respondents whose lower bound is exactly 150000 fall into neither group.
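Because the two income groups differ a lot in size, raw counts are hard to compare; shares within each group are easier to read. A minimal sketch, using the counts printed in Out[21] and Out[22] with shortened, hypothetical labels for the four travel answers:

```python
import pandas as pd

# Shortened labels (hypothetical) for the four travel answers.
labels = ['home', 'local', 'few hours', 'far']

# Counts copied from Out[21] (income < 150000) and Out[22] (income > 150000).
below = pd.Series([281, 203, 150, 55], index=labels)
above = pd.Series([49, 25, 16, 12], index=labels)

# Share of each answer within its own income group.
shares = pd.DataFrame({
    'below_150k': below / below.sum(),
    'above_150k': above / above.sum(),
}).round(3)
print(shares)
```

On the full dataset, `value_counts(normalize=True)` on each filtered column would give the same shares directly.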

7) Relating friendship and age. Friendsgiving, a version of the holiday for young people.

In [23]:
dataset.pivot_table(
    index = 'Have you ever tried to meet up with hometown friends on Thanksgiving night?',
    columns = 'Have you ever attended a "Friendsgiving?"',
    values = 'int_age'
)
Out[23]:
Have you ever attended a "Friendsgiving?"                                           No        Yes
Have you ever tried to meet up with hometown friends on Thanksgiving night?
No                                                                           42.283702  37.010526
Yes                                                                          41.475410  33.976744
In [24]:
dataset.pivot_table(
    index = 'Have you ever tried to meet up with hometown friends on Thanksgiving night?',
    columns = 'Have you ever attended a "Friendsgiving?"',
    values = 'int_income'
)
Out[24]:
Have you ever attended a "Friendsgiving?"                                              No           Yes
Have you ever tried to meet up with hometown friends on Thanksgiving night?
No                                                                           78914.549654  72894.736842
Yes                                                                          78750.000000  66019.736842

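Conclusions:

  • As the pivot tables show, respondents who have attended a Friendsgiving or met up with hometown friends are younger on average (about 34 for those who have done both, versus about 42 for those who have done neither) and report lower average income. Younger people are therefore more likely to celebrate Thanksgiving with friends than older people.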
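For readers unfamiliar with `pivot_table`: with no `aggfunc` given, each cell holds the mean of `values` for one combination of the index and column answers. A minimal sketch on made-up data, where `met_friends` and `friendsgiving` are hypothetical short names for the two survey questions:

```python
import pandas as pd

# Toy data (hypothetical): two yes/no answers and a numeric age per respondent.
toy = pd.DataFrame({
    'met_friends':   ['No', 'No', 'Yes', 'Yes', 'Yes', 'No'],
    'friendsgiving': ['No', 'Yes', 'No', 'Yes', 'Yes', 'No'],
    'int_age':       [45, 30, 45, 18, 30, 60],
})

# Default aggregation is the mean, so each cell is the average age of a group.
mean_age = toy.pivot_table(
    index='met_friends',
    columns='friendsgiving',
    values='int_age',
)
print(mean_age)
```

Here the ('Yes', 'Yes') cell averages the ages 18 and 30, which is how the tables in this section should be read: as group means, not counts.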

Published

mar. 3, 2017

Category

Python

Tags

  • Data Manipulation 1
  • Python 10
